Smart Paper Notes

A Modular Continuum Manipulator for Aerial Manipulation and Perching

Qianwen Zhao , Guoqing Zhang , Hamidreza Jafarnejadsani , Long Wang

Category: Robotics

2022-06-13

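Most aerial manipulators use serial rigid-link designs, which produce large forces when contact is initiated during manipulation and can make stable flight difficult. The compliance of continuum manipulators could ease this limitation. To this end, we present a novel design of a compact, lightweight, and modular cable-driven continuum manipulator for aerial drones. We then derive a complete modeling framework for its kinematics, statics, and stiffness (compliance). This framework is essential for integrating the manipulator into aerial drones. Finally, we report preliminary experimental validation of the hardware prototype, which provides insight into its manipulation feasibility. Future work includes integrating the proposed continuum manipulator with aerial drones and testing the combined system.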

translated by Google Translate

MHCCL: Masked Hierarchical Cluster-wise Contrastive Learning for Multivariate Time Series

Qianwen Meng , Hangwei Qian , Yong Liu , Yonghui Xu , Zhiqi Shen , Lizhen Cui

Category: Machine Learning | Artificial Intelligence

2022-12-02

Learning semantic-rich representations from raw unlabeled time series data is critical for downstream tasks such as classification and forecasting. Contrastive learning has recently shown promising representation learning capability in the absence of expert annotations. However, existing contrastive approaches generally treat each instance independently, which leads to false negative pairs that share the same semantics. To tackle this problem, we propose MHCCL, a Masked Hierarchical Cluster-wise Contrastive Learning model, which exploits semantic information obtained from a hierarchical structure consisting of multiple latent partitions for multivariate time series. Motivated by the observation that fine-grained clustering preserves higher purity while coarse-grained clustering reflects higher-level semantics, we propose a novel downward masking strategy to filter out fake negatives and supplement positives by incorporating the multi-granularity information from the clustering hierarchy. In addition, a novel upward masking strategy is designed in MHCCL to remove outliers of clusters at each partition to refine prototypes, which helps speed up the hierarchical clustering process and improves the clustering quality. We conduct experimental evaluations on seven widely used multivariate time series datasets. The results demonstrate the superiority of MHCCL over state-of-the-art approaches for unsupervised time series representation learning.
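
The idea of filtering false negatives can be illustrated with a minimal sketch. This is not the MHCCL implementation: it uses a single flat clustering rather than a hierarchy, it only drops same-cluster samples from the negatives (it does not supplement positives), and the function name `masked_info_nce`, the `clusters` argument, and the temperature value are assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def masked_info_nce(embeddings, positives, clusters, temperature=0.5):
    """InfoNCE-style loss that ignores negatives from the anchor's own cluster.

    embeddings : list of vectors, one per sample
    positives  : positives[i] is the index of sample i's positive (must differ from i)
    clusters   : clusters[i] is a cluster id for sample i (hypothetical input,
                 produced by any clustering step)
    """
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total = 0.0
    for i in range(n):
        pos_logit = dot(z[i], z[positives[i]]) / temperature
        # Denominator: the positive plus every sample from a different cluster.
        logits = [pos_logit]
        for j in range(n):
            if j != i and j != positives[i] and clusters[j] != clusters[i]:
                logits.append(dot(z[i], z[j]) / temperature)
        # Numerically stable log-sum-exp over the kept candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - pos_logit
    return total / n
```

Passing a distinct cluster id for every sample recovers the ordinary instance-wise loss, so the two settings can be compared directly on the same embeddings.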

Speaker-Guided Encoder-Decoder Framework for Emotion Recognition in Conversation

Yinan Bao , Qianwen Ma , Lingwei Wei , Wei Zhou , Songlin Hu

Category: Natural Language Processing

2022-06-07

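The Emotion Recognition in Conversation (ERC) task aims to predict the emotion label of each utterance in a conversation. Because the dependencies between speakers are complex and dynamic, comprising both intra-speaker and inter-speaker dependencies, modeling speaker-specific information plays a vital role in ERC. Although researchers have proposed various methods for modeling speaker interaction, these methods cannot jointly explore dynamic intra- and inter-speaker dependencies, which leads to insufficient understanding of context and further hinders emotion prediction. To this end, we design a novel speaker modeling scheme that jointly explores intra- and inter-speaker dependencies in a dynamic manner. In addition, we propose a Speaker-Guided Encoder-Decoder (SGED) framework for ERC, which fully exploits speaker information when decoding emotion. We use several existing methods as the conversational context encoder of our framework, demonstrating the high scalability and flexibility of the proposed framework. Experimental results demonstrate the superiority and effectiveness of SGED.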

Multi-Granularity Semantic Aware Graph Model for Reducing Position Bias in Emotion-Cause Pair Extraction

Yinan Bao , Qianwen Ma , Lingwei Wei , Wei Zhou , Songlin Hu

Category: Natural Language Processing

2022-05-04

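The Emotion-Cause Pair Extraction (ECPE) task aims to extract emotions and their causes as pairs from documents. We observe that the distribution of relative distances between emotions and causes is extremely imbalanced in the typical ECPE dataset. Existing methods set a fixed-size window to capture relations between neighboring clauses. However, they neglect effective semantic connections between distant clauses, which leads to poor generalization on position-insensitive data. To alleviate this problem, we propose a novel Multi-Granularity Semantic Aware Graph model (MGSAG) that jointly incorporates fine-grained and coarse-grained semantic features without any distance limitation. In particular, we first explore the semantic dependencies between clauses and keywords extracted from the document, which convey fine-grained semantic features, to obtain keyword-enhanced clause representations. In addition, a clause graph is built to model the coarse-grained semantic relations between clauses. Experimental results indicate that MGSAG surpasses existing state-of-the-art ECPE models. In particular, MGSAG significantly outperforms other models on position-insensitive data.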

EfficientFi: Towards Large-Scale Lightweight WiFi Sensing via CSI Compression

Jianfei Yang , Xinyan Chen , Han Zou , Dazhuo Wang , Qianwen Xu , Lihua Xie

Category: Artificial Intelligence

2022-04-08

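WiFi technology has been deployed in many places because of the growing demand for high-speed Internet access. Recently, beyond network services, WiFi sensing has become appealing in smart homes because it is device-free, cost-effective, and privacy-preserving. Although numerous WiFi sensing methods have been developed, most of them consider only a single smart home scenario. Without connections to powerful cloud servers and massive numbers of users, large-scale WiFi sensing remains difficult. In this paper, we first analyze and summarize these obstacles, and propose an efficient large-scale WiFi sensing framework, namely EfficientFi. EfficientFi works with edge computing at WiFi APs and cloud computing at center servers. It consists of a novel deep neural network that can compress fine-grained WiFi Channel State Information (CSI) at the edge, restore the CSI in the cloud, and perform sensing tasks at the same time. A quantized auto-encoder and a joint classifier are designed to achieve these goals in an end-to-end fashion. To the best of our knowledge, EfficientFi is the first IoT-cloud-enabled WiFi sensing framework that significantly reduces communication overhead while performing sensing tasks accurately. We use human activity recognition and identification via WiFi sensing as two case studies, and conduct extensive experiments to evaluate EfficientFi. The results show that it compresses CSI data from 1.368MB/s to 0.768kb/s with extremely low data reconstruction error and achieves over 98% accuracy for human activity recognition.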
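
The edge-compress, cloud-restore split can be pictured with a small sketch. The abstract only states that a quantized auto-encoder is used; the codebook lookup below is an assumption about one common way to quantize latents, and the names `nearest_code`, `compress`, `restore`, and the example codebook are hypothetical rather than taken from EfficientFi.

```python
def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(vector, codebook[k]))

def compress(latents, codebook):
    """Edge side (assumed): replace each latent vector with a codebook index."""
    return [nearest_code(v, codebook) for v in latents]

def restore(indices, codebook):
    """Cloud side (assumed): look the indices up to recover approximate latents."""
    return [codebook[k] for k in indices]

# Toy values for illustration only; a real system would learn the codebook.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
latents = [[0.1, -0.2], [0.9, 1.2], [4.0, 6.0]]
indices = compress(latents, codebook)
approx = restore(indices, codebook)
```

Only the small integer indices need to cross the network in this picture, which is where the bandwidth saving would come from; the reconstruction is approximate because each latent is snapped to its nearest codebook entry.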

TCM-SD: A Benchmark for Probing Syndrome Differentiation via Natural Language Processing

Mucheng Ren , Heyan Huang , Yuxiang Zhou , Qianwen Cao , Yuan Bu , Yang Gao

Category: Natural Language Processing | Artificial Intelligence

2022-03-21

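Traditional Chinese Medicine (TCM) is a natural, safe, and effective therapy that has spread and been applied worldwide. The unique TCM diagnosis and treatment system requires a comprehensive analysis of a patient's symptoms hidden in clinical records written in free text. Prior studies have shown that this system can be made more information-driven and intelligent with the help of artificial intelligence (AI) technology such as natural language processing (NLP). However, existing datasets lack the quality and quantity needed to support further development of data-driven AI technology in TCM. Therefore, in this paper, we focus on the core task of the TCM diagnosis and treatment system, syndrome differentiation (SD), and introduce the first public large-scale dataset for SD, called TCM-SD. Our dataset contains 54,152 real-world clinical records covering 148 syndromes. Furthermore, we collect a large-scale unlabeled text corpus in the TCM domain and propose a domain-specific pre-trained language model, called ZY-BERT. We conduct experiments with deep neural networks to establish strong performance baselines, reveal various challenges in SD, and demonstrate the potential of domain-specific pre-trained language models. Our study and analysis reveal opportunities for incorporating computer science and linguistics knowledge to explore the empirical validity of TCM theories.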

Saliency-Aware Spatio-Temporal Artifact Detection for Compressed Video Quality Assessment

Liqun Lin , Yang Zheng , Weiling Chen , Chengdong Lan , Tiesong Zhao

Category: Computer Vision

2023-01-03

Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.

More is Better: A Database for Spontaneous Micro-Expression with High Frame Rates

Sirui Zhao , Huaying Tang , Xinglong Mao , Shifeng Liu , Hanqing Tao , Hao Wang , Tong Xu , Enhong Chen

Category: Computer Vision

2023-01-03

Micro-expressions (MEs), one of the most important psychic stress reactions, are spontaneous and transient facial expressions that can reveal the genuine emotions of human beings. Thus, automatic ME recognition (MER) is becoming increasingly crucial in the field of affective computing, and provides essential technical support in lie detection, psychological analysis and other areas. However, the lack of abundant ME data seriously restricts the development of cutting-edge data-driven MER models. Despite recent efforts by several spontaneous ME datasets to alleviate this problem, the amount of available data remains small. To solve the problem of ME data hunger, we construct a dynamic spontaneous ME dataset with the largest current ME data scale, called DFME (Dynamic Facial Micro-expressions), which includes 7,526 well-labeled ME videos elicited from 671 participants and annotated by more than 20 annotators over three years. We then apply four classical spatiotemporal feature learning models to DFME in MER experiments to objectively verify the validity of the DFME dataset. In addition, we explore different solutions to the class imbalance and key-frame sequence sampling problems in dynamic MER on DFME, so as to provide a valuable reference for future research. The comprehensive experimental results show that our DFME dataset can facilitate research on automatic MER and provide a new benchmark for MER. DFME will be published via https://mea-lab-421.github.io.

Surveillance Face Anti-spoofing

Hao Fang , Ajian Liu , Jun Wan , Sergio Escalera , Chenxu Zhao , Xu Zhang , Stan Z. Li , Zhen Lei

Category: Computer Vision

2023-01-03

Face Anti-spoofing (FAS) is essential to secure face recognition systems from various physical attacks. However, recent research generally focuses on short-distance applications (e.g., phone unlocking) while lacking consideration of long-distance scenes (e.g., surveillance security checks). To promote relevant research and fill this gap in the community, we collect a large-scale Surveillance High-Fidelity Mask (SuHiFiMask) dataset captured under 40 surveillance scenes, which has 101 subjects from different age groups with 232 3D attacks (high-fidelity masks), 200 2D attacks (posters, portraits, and screens), and 2 adversarial attacks. In such scenes, low image resolution and noise interference are new challenges for surveillance FAS. Together with the SuHiFiMask dataset, we propose a Contrastive Quality-Invariance Learning (CQIL) network to alleviate the performance degradation caused by image quality from three aspects: (1) An Image Quality Variable module (IQV) is introduced to recover image information associated with discrimination by combining the super-resolution network. (2) Generated sample pairs are used to simulate quality variance distributions, helping contrastive learning strategies obtain robust feature representations under quality variation. (3) A Separate Quality Network (SQN) is designed to learn discriminative features independent of image quality. Finally, extensive experiments verify the quality of the SuHiFiMask dataset and the superiority of the proposed CQIL.

EZInterviewer: To Improve Job Interview Performance with Mock Interview Generator

Mingzhe Li , Xiuying Chen , Weiheng Liao , Yang Song , Tao Zhang , Dongyan Zhao , Rui Yan

Category: Natural Language Processing

2023-01-03

The interview is regarded as one of the most crucial steps in recruitment. To fully prepare for interviews with recruiters, job seekers usually practice mock interviews with each other. However, such a mock interview with peers is generally far from the real interview experience: the mock interviewers are not guaranteed to be professional and are not likely to behave like a real interviewer. Due to the rapid growth of online recruitment in recent years, recruiters tend to hold online interviews, which makes it possible to collect real interview data from real interviewers. In this paper, we propose a novel application named EZInterviewer, which aims to learn from online interview data and provide mock interview services to job seekers. The task is challenging in two ways: (1) interview data are now available but still low-resource; (2) generating meaningful and relevant interview dialogs requires a thorough understanding of both resumes and job descriptions. To address the low-resource challenge, EZInterviewer is trained on a very small set of interview dialogs. The key idea is to reduce the number of parameters that rely on interview dialogs by disentangling the knowledge selector and dialog generator, so that most parameters can be trained with ungrounded dialogs as well as resume data, which are not low-resource. Evaluation results on a real-world job interview dialog dataset indicate that we achieve promising results in generating mock interviews. With the help of EZInterviewer, we hope to make mock interview practice easier for job seekers.
